SYMMETRY IN SEMIDEFINITE PROGRAMS 



FRANK VALLENTIN 

ABSTRACT. This paper is a tutorial on a general and explicit procedure to
simplify semidefinite programs which are invariant under the action of a symmetry
group. The procedure is based on basic notions of representation theory of finite
groups. As an example we derive the block diagonalization of the Terwilliger
algebra of the binary Hamming scheme in this framework. Here its connection
to the orthogonal Hahn and Krawtchouk polynomials becomes visible.



1. Introduction 
A (complex) semidefinite program is an optimization problem of the form

(1) $\max\{\langle C, Y\rangle : \langle A_i, Y\rangle = b_i,\ i = 1, \ldots, n,\ \text{and}\ Y \succeq 0\}$,

where $A_i \in \mathbb{C}^{X \times X}$ and $C \in \mathbb{C}^{X \times X}$ are given Hermitian matrices whose rows
and columns are indexed by a finite set $X$, $(b_1, \ldots, b_n)^t \in \mathbb{R}^n$ is a given vector,
and $Y \in \mathbb{C}^{X \times X}$ is a variable Hermitian matrix, and where "$Y \succeq 0$" means that
$Y$ is positive semidefinite. Here $\langle C, Y\rangle = \operatorname{trace}(CY)$ denotes the trace product
between Hermitian matrices.

Semidefinite programming is an extension of linear programming and has a wide 
range of applications: combinatorial optimization and control theory are the most 
famous ones. Although semidefinite programming has an enormous expressive 
power in formulating convex optimization problems it has a few practical draw- 
backs: Highly robust and highly efficient solvers, unlike their counterparts for 
solving linear programs, are currently not available. So it is crucial to exploit the 
problems' structure to be able to perform computations. 

In recent years many results were obtained if the problem under consideration
has symmetry. This was done for a variety of problems and applications: interior
point algorithms (Kanno, Ohsaki, Murota, Katoh [16] and de Klerk, Pasechnik [5]),
polynomial optimization (Parrilo, Gatermann [10] and Jansson, Lasserre, Riener,
Theobald [14]), truss topology optimization (Bai, de Klerk, Pasechnik, Sotirov
[3]), quadratic assignment (de Klerk, Sotirov [7]), fast mixing Markov chains on
graphs (Boyd, Diaconis, Xiao [4]), graph coloring (Gvozdenović, Laurent [13]),
crossing numbers for complete bipartite graphs (de Klerk, Pasechnik, Schrijver [6])
and coding theory (Schrijver [20], Gijswijt, Schrijver, Tanaka [11] and Laurent [18]).

In all these applications the underlying principles are similar: one simplifies the 
original semidefinite program which is invariant under a group action by apply- 
ing an algebra isomorphism mapping a "large" matrix algebra to a "small" matrix 
algebra. Then it is sufficient to solve the semidefinite program using the smaller 
matrices. The existence of an appropriate algebra isomorphism is a classical fact 
from Artin-Wedderburn theory. However, in the above mentioned papers the ex- 
plicit determination of an appropriate isomorphism is rather mysterious. The aim 
of this paper is to give an algorithmic way to do this which also is well-suited for 
symbolic calculations by hand. 

The paper is structured as follows: Section 2 recalls basic definitions and shows
how the Artin-Wedderburn theorem stated in (4) can be applied to simplify a
semidefinite program invariant under a group action. In Section 3 we construct an
explicit algebra isomorphism. In Section 4 we apply this to the Terwilliger algebra
of the binary Hamming scheme.

This paper is of expository nature and probably few of the results are new. On 
the other hand a tutorial of how to use symmetry in semidefinite programming is 
not readily available. Furthermore our treatment of the Terwilliger algebra for bi- 
nary codes provides an alternative point of view which emphasizes the action of the 
symmetric group. Schrijver [20] treated the Terwilliger algebra with elementary 
combinatorial and linear algebraic arguments. Our derivation has the advantage 
that it gives an interpretation for the matrix entries in terms of Hahn polynomials. 
In a similar way one can derive the block diagonalization of the Terwilliger algebra 
for nonbinary codes which was computed by Gijswijt, Schrijver, Tanaka [11]. Here 
products of Hahn and Krawtchouk polynomials occur. 

2. Background and notation 

In this section we present the basic framework for simplifying a semidefinite 
program invariant under a group action. 

Let $G$ be a finite group which acts on a finite set $X$ by $(a, x) \mapsto ax$ with $a \in G$
and $x \in X$. This group action extends to an action on pairs $(x, y) \in X \times X$
by $(a, (x, y)) \mapsto (ax, ay)$. In this way it extends to square matrices whose rows
and columns are indexed by $X$: for an $X \times X$-matrix $M$ we have $aM(x, y) =
M(ax, ay)$. Here $M(x, y)$ denotes the entry of $M$ at position $(x, y)$. A matrix $M$
is called invariant under $G$ if $M = aM$ for all $a \in G$.
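
The definition of invariance can be tried out on a small case. The following is a minimal sketch, assuming the cyclic shift on $X = \{0, 1, 2, 3\}$ as the group action; the helper names `act_on_matrix` and `is_invariant` are illustrative and not taken from the paper. It relies on the fact that a matrix fixed by every generator is fixed by the whole generated group.

```python
def act_on_matrix(perm, M):
    """Return aM with (aM)(x, y) = M(a x, a y); perm[x] is the image a x."""
    n = len(M)
    return [[M[perm[x]][perm[y]] for y in range(n)] for x in range(n)]


def is_invariant(generators, M):
    """Check M = aM for every generator a (this suffices for the generated group)."""
    return all(act_on_matrix(g, M) == M for g in generators)


# Assumed example: the shift x -> x + 1 (mod 4) generates the cyclic group Z_4.
shift = [1, 2, 3, 0]
# A circulant matrix depends only on y - x, so shifting both indices changes nothing.
circulant = [[(y - x) % 4 for y in range(4)] for x in range(4)]
# A generic matrix with pairwise distinct entries cannot be invariant.
generic = [[4 * x + y for y in range(4)] for x in range(4)]

print(is_invariant([shift], circulant), is_invariant([shift], generic))
```

Under these assumptions the circulant matrix is invariant and the generic one is not.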

A Hermitian matrix $Y \in \mathbb{C}^{X \times X}$ is called a feasible solution of (1) if it fulfills
the conditions $\langle A_i, Y\rangle = b_i$ for $i = 1, \ldots, n$ and $Y \succeq 0$. It is called an optimal
solution if it is feasible and if for all other feasible solutions $Y'$ we have
$\langle C, Y\rangle \geq \langle C, Y'\rangle$. In the following we assume that the semidefinite program (1)
has an optimal solution.

We say that the semidefinite program (1) is invariant under $G$ if for every feasible
solution $Y$ and for every $a \in G$ the matrix $aY$ is again a feasible solution
and if it satisfies $\langle C, aY\rangle = \langle C, Y\rangle$ for all $a \in G$. Because of the convexity of
(1), one can find an optimal solution of (1) in the subspace $\mathcal{B}$ of matrices which are
invariant under $G$. In fact, if $Y$ is an optimal solution of (1), so is its group average
$\frac{1}{|G|} \sum_{a \in G} aY$. Hence, (1) is equivalent to

(2) $\max\{\langle C, Y\rangle : \langle A_i, Y\rangle = b_i,\ i = 1, \ldots, n,\ Y \succeq 0,\ \text{and}\ Y \in \mathcal{B}\}$.
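
The group averaging argument can be illustrated numerically. The sketch below assumes $G = S_3$ acting on $X = \{0, 1, 2\}$ and uses made-up matrices `Y` and `C`; it averages `Y` over the group with exact rational arithmetic and compares the objective values. It does not check positive semidefiniteness.

```python
from fractions import Fraction
from itertools import permutations


def act_on_matrix(perm, M):
    """(aM)(x, y) = M(a x, a y), with perm[x] the image a x."""
    n = len(M)
    return [[M[perm[x]][perm[y]] for y in range(n)] for x in range(n)]


def group_average(group, Y):
    """Return (1/|G|) * sum over a in G of aY, computed exactly."""
    n = len(Y)
    avg = [[Fraction(0)] * n for _ in range(n)]
    for a in group:
        aY = act_on_matrix(a, Y)
        for x in range(n):
            for y in range(n):
                avg[x][y] += Fraction(aY[x][y], len(group))
    return avg


def trace_product(C, Y):
    """<C, Y> = trace(C Y)."""
    n = len(C)
    return sum(C[x][y] * Y[y][x] for x in range(n) for y in range(n))


group = list(permutations(range(3)))      # all six elements of S_3
Y = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]     # hypothetical symmetric matrix
C = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]     # objective matrix invariant under S_3
Y_avg = group_average(group, Y)

print(Y_avg[0][0], Y_avg[0][1], trace_product(C, Y), trace_product(C, Y_avg))
```

In this example the average has constant diagonal and constant off-diagonal entries, and the objective value is unchanged, as the invariance of `C` predicts.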

The set $X \times X$ can be decomposed into the orbits $R_1, \ldots, R_N$ by the action of $G$.
For every $r \in \{1, \ldots, N\}$ we define the matrix $B_r \in \{0, 1\}^{X \times X}$ by $B_r(x, y) = 1$
if $(x, y) \in R_r$ and $B_r(x, y) = 0$ otherwise. Then $B_1, \ldots, B_N$ forms a basis of $\mathcal{B}$.
We call $B_1, \ldots, B_N$ the canonical basis of $\mathcal{B}$. If $(x, y) \in R_r$ we also write $B_{[x,y]}$
instead of $B_r$. Note that $B_{[y,x]}$ is the transpose of the matrix $B_{[x,y]}$.
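
The orbits of pairs and the canonical basis can be computed directly from generators of the group. The following sketch assumes $X = \{0,1\}^3$ with $S_3$ permuting coordinates (the setting of Section 4 for $n = 3$); the function names and the representation of generators as dictionaries are illustrative choices.

```python
from itertools import product


def pair_orbits(points, generators):
    """Partition X x X into orbits of the group generated by `generators`.

    Each generator is a dict sending a point x to its image a x. Closing a
    pair under the generators yields its full orbit because the group is finite.
    """
    unseen = set(product(points, repeat=2))
    orbits = []
    while unseen:
        start = unseen.pop()
        orbit, stack = {start}, [start]
        while stack:
            x, y = stack.pop()
            for g in generators:
                image = (g[x], g[y])
                if image not in orbit:
                    orbit.add(image)
                    stack.append(image)
        unseen -= orbit
        orbits.append(sorted(orbit))
    return orbits


def canonical_basis(points, orbits):
    """Return the 0/1 matrices B_r, one per orbit R_r."""
    index = {x: i for i, x in enumerate(points)}
    basis = []
    for orbit in orbits:
        B = [[0] * len(points) for _ in points]
        for x, y in orbit:
            B[index[x]][index[y]] = 1
        basis.append(B)
    return basis


# Assumed example: two transpositions generate S_3 acting on binary vectors.
points = list(product((0, 1), repeat=3))
swap01 = {x: (x[1], x[0], x[2]) for x in points}
swap12 = {x: (x[0], x[2], x[1]) for x in points}
orbits = pair_orbits(points, [swap01, swap12])
basis = canonical_basis(points, orbits)
print(len(orbits))
```

Since the orbits partition $X \times X$, the matrices $B_r$ sum to the all-ones matrix, which gives a quick sanity check on the output.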

So the first step to simplify a semidefinite program which is invariant under a
group is as follows:

If the semidefinite program (1) is invariant under $G$, then (1) is equivalent to

(3) $\max\{c_1 y_1 + \cdots + c_N y_N : y_1, \ldots, y_N \in \mathbb{C},$
    $a_{i1} y_1 + \cdots + a_{iN} y_N = b_i,\ i = 1, \ldots, n,$
    $y_j = \overline{y_k}\ \text{if}\ B_j = (B_k)^t,$
    $y_1 B_1 + \cdots + y_N B_N \succeq 0\},$

where $c_r = \langle C, B_r\rangle$ and $a_{ir} = \langle A_i, B_r\rangle$.

The following obvious property is crucial for the next step of simplifying (3):
The subspace $\mathcal{B}$ is closed under matrix multiplication. So $\mathcal{B}$ is a (semisimple)
algebra over the complex numbers. The Artin-Wedderburn theory (cf. [17, Chapter
1]) gives:

There are numbers $d$, and $m_1, \ldots, m_d$ so that there is an algebra isomorphism

(4) $\varphi : \mathcal{B} \to \bigoplus_{k=1}^{d} \mathbb{C}^{m_k \times m_k}$.

This applied to (3) gives the final step of simplifying (1):

If the semidefinite program (1) is invariant under $G$, then (1) is equivalent to

(5) $\max\{c_1 y_1 + \cdots + c_N y_N : y_1, \ldots, y_N \in \mathbb{C},$
    $a_{i1} y_1 + \cdots + a_{iN} y_N = b_i,\ i = 1, \ldots, n,$
    $y_j = \overline{y_k}\ \text{if}\ B_j = (B_k)^t,$
    $y_1 \varphi(B_1) + \cdots + y_N \varphi(B_N) \succeq 0\}.$



Notice that since $\varphi$ is an algebra isomorphism between matrix algebras with
unity, it preserves eigenvalues and hence positive semidefiniteness. In accordance
with the literature, applying $\varphi$ to a semidefinite program is called block
diagonalization.

The advantage of (5) is that instead of dealing with matrices of size $|X| \times |X|$ one
has to deal with block diagonal matrices with $d$ block matrices of size $m_1, \ldots, m_d$,
respectively. In many applications the sum $m_1 + \cdots + m_d$ is much smaller than
$|X|$ and in particular many practical solvers take advantage of the block structure
to speed up the numerical calculations.

3. Determining a block diagonalization 

In this section we give an explicit construction of an algebra isomorphism $\varphi$. It
has two main features: One can turn the construction into an algorithm as we show
at the end of this section, and one can use it for symbolic calculations by hand as
we demonstrate in Section 4.

3.1. Construction. We begin with some basic notions from representation theory
of finite groups. Consider the complex vector space $\mathbb{C}^X$ of vectors indexed by $X$
with inner product $(f, g) = \frac{1}{|X|} \sum_{x \in X} f(x) \overline{g(x)}$. The group $G$ acts on $\mathbb{C}^X$ by
$af(x) = f(a^{-1}x)$. Note that the inner product on $\mathbb{C}^X$ is invariant under the group
action: For all $f, g \in \mathbb{C}^X$ and all $a \in G$ we have $(af, ag) = (f, g)$. A subspace
$H \subseteq \mathbb{C}^X$ is called a $G$-space if $GH \subseteq H$ where $GH = \{af : f \in H, a \in G\}$.
It is called irreducible if the only proper subspace $H' \subseteq H$ with $GH' \subseteq H'$
is $\{0\}$. Two $G$-spaces $H$ and $H'$ are called equivalent if there is a $G$-isometry
$\phi : H \to H'$, i.e. a linear isomorphism with $\phi(af) = a\phi(f)$ for all $f \in H$ and
$a \in G$ and $(\phi(f), \phi(g)) = (f, g)$ for all $f, g \in H$.

By Maschke's theorem (cf. [12, Theorem 2.4.1]) one can decompose $\mathbb{C}^X$
orthogonally into irreducible $G$-spaces:

(6) $\mathbb{C}^X = (H_{1,1} \perp \ldots \perp H_{1,m_1}) \perp \ldots \perp (H_{d,1} \perp \ldots \perp H_{d,m_d})$,

where $H_{k,i}$, with $k = 1, \ldots, d$ and $i = 1, \ldots, m_k$, is an irreducible $G$-space of
dimension $h_k$ and where $H_{k,i}$ and $H_{k',i'}$ are equivalent if and only if $k = k'$.

Let $\mathcal{A}$ be the subalgebra of $\mathbb{C}^{X \times X}$ which is generated by the permutation
matrices $P_a \in \mathbb{C}^{X \times X}$ with $a \in G$ where

(7) $P_a(x, y) = \begin{cases} 1 & \text{if } a^{-1}x = y, \\ 0 & \text{otherwise.} \end{cases}$

Because of (6) the algebra $\mathcal{A}$ decomposes as a complex vector space in the
following way

(8) $\mathcal{A} \cong \bigoplus_{k=1}^{d} \mathbb{C}^{h_k \times h_k} \otimes I_{m_k}$.

Recall that by $\mathcal{B}$ we denote the matrices in $\mathbb{C}^{X \times X}$ which are invariant under the
group action of $G$. In other words, it is the commutant of $\mathcal{A}$:

$\mathcal{B} = \operatorname{Comm}(\mathcal{A}) = \{B \in \mathbb{C}^{X \times X} : BA = AB \text{ for all } A \in \mathcal{A}\}$.

The double commutant theorem [12, Theorem 3.3.7] gives the following
decomposition of $\mathcal{B}$ as a complex vector space:

(9) $\mathcal{B} \cong \bigoplus_{k=1}^{d} I_{h_k} \otimes \mathbb{C}^{m_k \times m_k}$.

Now we construct an explicit algebra isomorphism between the commutant
algebra $\mathcal{B}$ and the matrix algebra $\bigoplus_{k=1}^{d} \mathbb{C}^{m_k \times m_k}$.

Let $e_{k,1,l}$ with $l = 1, \ldots, h_k$ be an orthonormal basis of the space $H_{k,1}$. Choose
$G$-isometries $\phi_{k,i} : H_{k,1} \to H_{k,i}$. Then, $e_{k,i,l} = \phi_{k,i}(e_{k,1,l})$ is an orthonormal
basis of $H_{k,i}$. Define the matrix $E_{k,i,j} \in \mathbb{C}^{X \times X}$ with $i, j = 1, \ldots, m_k$ by

$E_{k,i,j}(x, y) = \frac{1}{|X|} \sum_{l=1}^{h_k} e_{k,i,l}(x) \overline{e_{k,j,l}(y)}$.

The definition of these matrices depends on the choice of the orthonormal basis,
on the chosen $G$-isometries and on the chosen decomposition (6). The following
proposition shows the effect of different choices.

Proposition 3.1. By $E_k(x, y)$ we denote the $m_k \times m_k$ matrix $(E_{k,i,j}(x, y))_{i,j}$.

(a) The matrix entries $E_{k,i,j}(x, y)$ do not depend on the choice of the orthonormal
basis of $H_{k,1}$.

(b) The change of $\phi_{k,i}$ to $\alpha\phi_{k,i}$ with $\alpha \in \mathbb{C}$, $|\alpha| = 1$, simultaneously changes
the $i$-th row and $i$-th column in the matrix $E_k(x, y)$ by a multiplication with $\alpha$ and
$\overline{\alpha}$, respectively.

(c) The choice of another decomposition of $H_{k,1} \perp \ldots \perp H_{k,m_k}$ as a sum of
$m_k$ orthogonal, irreducible $G$-spaces changes $E_k(x, y)$ to $U E_k(x, y) U^{-1}$ for some
unitary matrix $U \in U(\mathbb{C}^{m_k})$.

Proof. This was proved in [2, Theorem 3.1] with the only difference that there only
the real case was considered. The complex case follows mutatis mutandis. □

The following theorem shows that the map

(10) $\varphi : \mathcal{B} \to \bigoplus_{k=1}^{d} \mathbb{C}^{m_k \times m_k}$

mapping $E_{k,i,j}$ to the elementary matrix with the only non-zero entry 1 at position
$(i, j)$ in the $k$-th summand $\mathbb{C}^{m_k \times m_k}$ of the direct sum is an algebra isomorphism.

Theorem 3.2. The matrices $E_{k,i,j}$ form a basis of $\mathcal{B}$ satisfying the equation

(11) $E_{k,i,j} E_{k',i',j'} = \delta_{k,k'} \delta_{j,i'} E_{k,i,j'}$,

where $\delta$ denotes Kronecker's delta.

Proof. The multiplication formula (11) is a direct consequence of the orthonormality
of the vectors $e_{k,i,l}$. That $E_{k,i,j}$ is an element of $\mathcal{B}$ follows from [2, Theorem
3.1 (c)]. From (11) it follows that the matrices $E_{k,i,j}$ are linearly independent; they
span a vector space of dimension $\sum_{k=1}^{d} m_k^2$. Hence, by (9), they form a basis of
the commutant $\mathcal{B}$. □

Now the expansion of the canonical basis $B_r$, with $r = 1, \ldots, N$, in the basis
$E_{k,i,j}$ with coefficients $p_r(k, i, j)$

(12) $B_r = \sum_{k=1}^{d} \sum_{i,j=1}^{m_k} p_r(k, i, j) E_{k,i,j}$

yields

$\varphi(B_r) = \sum_{k=1}^{d} \sum_{i,j=1}^{m_k} p_r(k, i, j) \varphi(E_{k,i,j})$.
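
As a small illustration of the construction, consider the following sketch, which assumes the cyclic group $\mathbb{Z}_5$ acting on itself by translation (an example not treated in the paper). Here every irreducible $G$-space is spanned by a character $e_k(x) = \omega^{kx}$, so $d = 5$ and $h_k = m_k = 1$, and the matrices $E_{k,1,1}$, written `E(k)` below, can be checked against the multiplication formula (11) numerically.

```python
import cmath

n = 5  # assumed example: G = Z_5 acting on X = {0, ..., 4} by translation
omega = cmath.exp(2j * cmath.pi / n)


def E(k):
    """E_k(x, y) = (1/|X|) e_k(x) conj(e_k(y)) for the character e_k(x) = omega^(k x)."""
    return [[omega ** (k * x) * (omega ** (k * y)).conjugate() / n
             for y in range(n)] for x in range(n)]


def matmul(A, B):
    return [[sum(A[x][z] * B[z][y] for z in range(n)) for y in range(n)]
            for x in range(n)]


def max_abs_diff(A, B):
    return max(abs(A[x][y] - B[x][y]) for x in range(n) for y in range(n))


zero = [[0j] * n for _ in range(n)]


def multiplication_error():
    """Largest deviation from E_k E_k' = delta_{k,k'} E_k over all k, k'."""
    return max(max_abs_diff(matmul(E(k), E(kp)), E(k) if k == kp else zero)
               for k in range(n) for kp in range(n))


def invariance_error():
    """Largest deviation from E_k(x + 1, y + 1) = E_k(x, y)."""
    return max(abs(E(k)[(x + 1) % n][(y + 1) % n] - E(k)[x][y])
               for k in range(n) for x in range(n) for y in range(n))


print(multiplication_error() < 1e-9, invariance_error() < 1e-9)
```

Both checks are done in floating point, so they confirm the relations only up to rounding error.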

3.2. Orthogonality relation. For the computation of the coefficients $p_r(k, i, j)$
the following orthogonality relation is often helpful.

If we expand the basis $|X| E_{k,i,j}$ in the canonical basis $B_r$ we get a relation
which after normalization is inverse to (12):

(13) $|X| E_{k,i,j} = \sum_{r=1}^{N} q_{k,i,j}(r) B_r$.

So we have an orthogonality relation between the $q_{k,i,j}$:
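Lemma 3.3. Let $v_r = |\{(x, y) \in X \times X : (x, y) \in R_r\}|$. Then,

(14) $\sum_{r=1}^{N} v_r q_{k,i,j}(r) \overline{q_{k',i',j'}(r)} = \delta_{k,k'} \delta_{j,j'} \delta_{i,i'} |X|^2 h_k$.

Proof. Consider the sum $\sum_{x \in X} E_{k,i,j} E_{k',j',i'}(x, x)$.
On the one hand it is equal to

$\sum_{x \in X} \delta_{k,k'} \delta_{j,j'} E_{k,i,i'}(x, x) = \delta_{k,k'} \delta_{j,j'} \operatorname{trace} E_{k,i,i'}$,

and

$\operatorname{trace} E_{k,i,i'} = \sum_{l=1}^{h_k} (e_{k,i,l}, e_{k,i',l}) = \delta_{i,i'} h_k$.

On the other hand it is

$\sum_{x \in X} \sum_{y \in X} E_{k,i,j}(x, y) E_{k',j',i'}(y, x) = \frac{1}{|X|^2} \sum_{r=1}^{N} v_r q_{k,i,j}(r) \overline{q_{k',i',j'}(r)}$,

where we used the fact $E_{k',j',i'}(y, x) = \overline{E_{k',i',j'}(x, y)}$ which follows from the
definition. □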

The orthogonality relation gives a direct way to compute $p_r(k, i, j)$ once $q_{k,i,j}(r)$
is known: We have

(15) $p_r(k, i, j) = \frac{v_r \overline{q_{k,i,j}(r)}}{|X| h_k}$,

which follows by Lemma 3.3 and by (12) and (13) because of

$\sum_{r=1}^{N} p_r(k, i, j) q_{k',i',j'}(r) = |X| \delta_{k,k'} \delta_{i,i'} \delta_{j,j'}$.
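
Formulas (13), (14) and (15) can be checked on a toy case. The sketch below again assumes the cyclic group $\mathbb{Z}_4$ acting on itself by translation, which is not an example from the paper: the orbits are $R_r = \{(x, y) : y - x \equiv r\}$ with $v_r = 4$, all $h_k = m_k = 1$, and the indices $i, j$ are dropped.

```python
import cmath

n = 4  # assumed example: G = Z_4 acting on X = {0, 1, 2, 3} by translation
omega = cmath.exp(2j * cmath.pi / n)
h = 1           # every irreducible G-space is one-dimensional here
v = [n] * n     # each orbit R_r = {(x, y) : y - x = r mod n} has n elements


def E(k, x, y):
    """E_k(x, y) = (1/|X|) omega^(k (x - y))."""
    return omega ** (k * (x - y)) / n


def B(r, x, y):
    """Canonical basis matrix of the orbit R_r."""
    return 1 if (y - x) % n == r else 0


def q(k, r):
    """Coefficient in (13), read off at the pair (0, r), which lies in R_r."""
    return n * E(k, 0, r)


def p(r, k):
    """Coefficient of B_r in the basis E_k, computed by formula (15)."""
    return v[r] * q(k, r).conjugate() / (n * h)


def expansion_error():
    """Largest deviation from B_r = sum_k p_r(k) E_k, i.e. from (12)."""
    return max(abs(B(r, x, y) - sum(p(r, k) * E(k, x, y) for k in range(n)))
               for r in range(n) for x in range(n) for y in range(n))


def orthogonality_error():
    """Largest deviation from the orthogonality relation (14)."""
    return max(abs(sum(v[r] * q(k, r) * q(kp, r).conjugate() for r in range(n))
                   - (n * n * h if k == kp else 0))
               for k in range(n) for kp in range(n))


print(expansion_error() < 1e-9, orthogonality_error() < 1e-9)
```

In this example $\varphi(B_r)$ is the diagonal matrix with entries $p_r(k) = \omega^{kr}$, which is the familiar diagonalization of circulant matrices by the discrete Fourier transform.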

3.3. Algorithmic issues. We conclude this section by reviewing algorithmic issues
for computing $\varphi$. To calculate the isomorphism one has to perform the following
steps:

(1) Compute the orthogonal decomposition (6) of $\mathbb{C}^X$ into pairwise orthogonal,
irreducible $G$-spaces $H_{k,i}$.

(2) For every irreducible $G$-space $H_{k,i}$ determine an orthonormal basis.

(3) Find $G$-isometries $\phi_{k,i} : H_{k,1} \to H_{k,i}$.

(4) Express the basis $B_r$ in the basis $E_{k,i,j}$.

Only the first step requires an algorithm which is not classical. Here one can
use an algorithm of Babai and Rónyai [1]. It is a randomized algorithm running
in expected polynomial time for computing the orthogonal decomposition (6). It
requires the permutation matrices $P_a$ given in (7) as input, where $a$ runs through
a (preferably small) generating set of $G$. The other steps can be carried out using
Gram-Schmidt orthonormalization and solving systems of linear equations.

4. Block diagonalization of the Terwilliger algebra 

The symmetric group $S_n$ acts on the set $X = \{0, 1\}^n$ of binary vectors with
length $n$ by $\sigma(x_1, \ldots, x_n) = (x_{\sigma(1)}, \ldots, x_{\sigma(n)})$, i.e. by permuting coordinates. In
[20] Schrijver determined the block diagonalization of the algebra $\mathcal{B}$ of $X \times X$-matrices
invariant under this group action. The algebra $\mathcal{B}$ is called the Terwilliger
algebra of the binary Hamming scheme. Now we shall derive a block diagonalization
in the framework of the previous section. In this case it is possible to work
over the real numbers only because all irreducible representations of the symmetric
group are real.

Under the group action the set $X$ splits into $n + 1$ orbits $X_0, \ldots, X_n$ where $X_m$
contains the elements of $\{0, 1\}^n$ having Hamming weight $m$, i.e. elements which
one can get from the binary vector $1^m 0^{n-m}$ by permuting coordinates. So we have
the orthogonal decomposition of the $S_n$-space $\mathbb{R}^X$ into

$\mathbb{R}^X = \mathbb{R}^{X_0} \perp \ldots \perp \mathbb{R}^{X_n}$.

It is a classical fact (cf. [8, Theorem 2.10]) that the $S_n$-space $\mathbb{R}^{X_m}$ decomposes
further into

$\mathbb{R}^{X_m} = \begin{cases} H_{0,m} \perp \ldots \perp H_{m,m}, & \text{when } 0 \leq m \leq \lfloor n/2 \rfloor, \\ H_{0,m} \perp \ldots \perp H_{n-m,m}, & \text{otherwise,} \end{cases}$

where $H_{k,m}$ are irreducible $S_n$-spaces which correspond to the irreducible
representation of $S_n$ given by the partition $(n - k, k)$ (cf. [19, Chapter 2]). Its dimension
is $h_k = \binom{n}{k} - \binom{n}{k-1}$.

Thus, the matrices $E_{k,i,j}$, with $k = 0, \ldots, \lfloor n/2 \rfloor$, which correspond to the
isotypic component $H_{k,k} \perp \ldots \perp H_{k,n-k}$ of $\mathbb{R}^X$ of type $(n - k, k)$ are conveniently
indexed by $i, j = k, \ldots, n - k$. Since $E_{k,j,i}$ is the transpose of $E_{k,i,j}$ we only need
to consider the case $k \leq i \leq j \leq n - k$.
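
The block sizes in this decomposition are easy to tabulate. The following sketch uses hypothetical helper names `h`, `m` and `summary`; it computes $h_k$ and the multiplicity $m_k = n - 2k + 1$ (the number of indices $i = k, \ldots, n - k$) and checks two dimension counts: $\sum_k h_k m_k = 2^n = |X|$ and $\sum_k m_k^2 = \binom{n+3}{3}$, the dimension of $\mathcal{B}$.

```python
from math import comb


def h(n, k):
    """Dimension h_k of the irreducible representation for the partition (n - k, k)."""
    return comb(n, k) - (comb(n, k - 1) if k >= 1 else 0)


def m(n, k):
    """Multiplicity m_k: the index i ranges over k, ..., n - k."""
    return n - 2 * k + 1


def summary(n):
    ks = range(n // 2 + 1)
    return {
        "matrix_size": 2 ** n,                               # |X|
        "block_sizes": [m(n, k) for k in ks],                # sizes after block diagonalization
        "dim_check": sum(h(n, k) * m(n, k) for k in ks),     # should equal |X|
        "algebra_dim": sum(m(n, k) ** 2 for k in ks),        # dimension of the algebra B
    }


print(summary(4))
print(summary(10)["matrix_size"], summary(10)["block_sizes"])
```

For $n = 10$ this replaces matrices of size $1024 \times 1024$ by six blocks of sizes 11, 9, 7, 5, 3 and 1.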

To determine $E_{k,i,j}(x, y)$ we rely on the papers [8] and [9] of Dunkl. We recall
the facts and notation which we will need from them. Let $T_k : S_n \to O(\mathbb{R}^{h_k})$ be
an orthogonal, irreducible representation of $S_n$ given by the partition $(n - k, k)$.
By $H$, $K$ we denote the subgroups $H = S_j \times S_{n-j}$ and $K = S_i \times S_{n-i}$ of $S_n$. Let
$V_k \subseteq \mathbb{R}^{S_n}$ be the vector space spanned by the functions $(T_k)_{rs}$, with $1 \leq r, s \leq h_k$,
which are the matrix entries of $T_k$: $(T_k)_{rs}(\pi) = [T_k(\pi)]_{rs}$. A function $f \in V_k$ is
called $H$-$K$-invariant if $f(\sigma\pi\tau) = f(\pi)$ for all $\sigma \in H$, $\pi \in S_n$, $\tau \in K$. In [8, §4]
and [9, §4] Dunkl computed the $H$-$K$-invariant functions of $V_k$. These are all real
multiples of

$\psi_{k,H-K}(\pi) = \frac{(-j)_k (i-n)_k}{(-i)_k (j-n)_k} Q_k(v(\pi); -(n-i)-1, -i-1, j)$,

where $(a)_0 = 1$, $(a)_k = a(a + 1) \cdots (a + k - 1)$, and where

$Q_k(x; -a-1, -b-1, m) = \frac{1}{\binom{m}{k}} \sum_{j=0}^{k} (-1)^j \frac{\binom{b-k+j}{j}}{\binom{a}{j}} \binom{m-x}{k-j} \binom{x}{j}$

are Hahn polynomials (for integers $m, a, b$ with $a \geq m$, $b \geq m \geq 0$), and where

$v(\pi) = i - |\pi\{1, \ldots, i\} \cap \{1, \ldots, j\}|$.

The polynomials $Q_k(x) = Q_k(x; -a-1, -b-1, m)$ are the orthogonal polynomials
for the weight function $\binom{a}{x}\binom{b}{m-x}$, $x = 0, 1, \ldots, m$, normalized by $Q_k(0) = 1$.
For more information about Hahn polynomials we refer to [15].
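
The Hahn polynomials can be evaluated exactly with rational arithmetic. The sketch below assumes the explicit sum $Q_k(x; -a-1, -b-1, m) = \binom{m}{k}^{-1} \sum_{j=0}^{k} (-1)^j \binom{b-k+j}{j} \binom{a}{j}^{-1} \binom{m-x}{k-j} \binom{x}{j}$ and the weight $\binom{a}{x}\binom{b}{m-x}$; the function names are illustrative. It can be used to check the normalization $Q_k(0) = 1$ and the orthogonality for small parameters.

```python
from fractions import Fraction
from math import comb


def hahn(k, x, a, b, m):
    """Evaluate Q_k(x; -a-1, -b-1, m) exactly.

    Assumes integers with a >= m, b >= m >= k >= 0 and 0 <= x <= m, so that
    no binomial coefficient in a denominator vanishes.
    """
    total = Fraction(0)
    for j in range(k + 1):
        term = Fraction(comb(b - k + j, j), comb(a, j))
        term *= comb(m - x, k - j) * comb(x, j)
        total += (-1) ** j * term
    return total / comb(m, k)


def weighted_inner_product(k, l, a, b, m):
    """Sum over x = 0..m of binom(a, x) binom(b, m - x) Q_k(x) Q_l(x)."""
    return sum(comb(a, x) * comb(b, m - x) * hahn(k, x, a, b, m) * hahn(l, x, a, b, m)
               for x in range(m + 1))


# Hypothetical small parameters a = b = 3, m = 2.
for k in range(3):
    print([hahn(k, x, 3, 3, 2) for x in range(3)])
print(weighted_inner_product(1, 2, 3, 3, 2))
```

Using `Fraction` avoids rounding issues, so a vanishing inner product is reported as an exact zero.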

We will need the square of the norm of $\psi_{k,H-K}$ which is given in [9, before
Proposition 2.7]:

$(\psi_{k,H-K}, \psi_{k,H-K}) = \frac{\psi_{k,H-K}(\mathrm{id})}{h_k} = \frac{(-j)_k (i-n)_k}{(-i)_k (j-n)_k h_k}$.

Let $e_{k,i,1}, \ldots, e_{k,i,h_k}$ be an orthonormal basis of $H_{k,i}$. We get an orthogonal,
irreducible representation $T_{k,i} : S_n \to O(\mathbb{R}^{h_k})$ by

$\pi(e_{k,i,l}) = \sum_{l'=1}^{h_k} [T_{k,i}(\pi)]_{l',l}\, e_{k,i,l'}$.
i'=i 

Consider the function

$z_{k,i,j}(\pi) = E_{k,i,j}(\pi(1^i 0^{n-i}), 1^j 0^{n-j})$.

This is an $H$-$K$-invariant function because $E_{k,i,j} \in \mathcal{B}$. It lies in $V_k$ because vector
spaces spanned by matrix entries of two equivalent irreducible representations
coincide. Thus, $z_{k,i,j}$ is a real multiple of $\psi_{k,H-K}$. By computing the squared norm
of $z_{k,i,j}$ we determine this multiple up to sign:

$(z_{k,i,j}, z_{k,i,j}) = \frac{1}{n!} \sum_{\pi \in S_n} \big(E_{k,i,j}(\pi(1^i 0^{n-i}), 1^j 0^{n-j})\big)^2$
$= \binom{n}{i}^{-1} E_{k,j,j}(1^j 0^{n-j}, 1^j 0^{n-j})$.

Here we used that $e_{k,i,l}$ is an orthonormal basis of $H_{k,i}$ where the inner product is
$(f, g) = \frac{1}{|X|} \sum_{x \in X} f(x) g(x)$.

All diagonal entries belonging to $X_j \times X_j$ of $E_{k,j,j}$ coincide and all others are
zero, so $\binom{n}{j} E_{k,j,j}(1^j 0^{n-j}, 1^j 0^{n-j})$ is the trace of $E_{k,j,j}$ which equals its rank $h_k$.

Hence, $(z_{k,i,j}, z_{k,i,j}) = h_k / \big(\binom{n}{i}\binom{n}{j}\big)$. So we have determined $z_{k,i,j}$ up to sign.

To adjust the signs it is enough to ensure that the multiplication formula (11) is
satisfied.

So putting it together, we have proved the following theorem.

Theorem 4.1. For $x, y \in X$ define $v(x, y) = |\{l \in \{1, \ldots, n\} : x_l = 1, y_l = 0\}|$.
For $k = 0, \ldots, \lfloor n/2 \rfloor$ and $i, j = k, \ldots, n - k$ with $i \leq j$ we have

$E_{k,i,j}(x, y) = \frac{h_k}{\big(\binom{n}{i}\binom{n}{j}\big)^{1/2}} \left(\frac{(-j)_k (i-n)_k}{(-i)_k (j-n)_k}\right)^{1/2} Q_k(v(x, y); -(n-i)-1, -i-1, j)$,

when $x \in X_i$, $y \in X_j$. In the case $x \notin X_i$ or $y \notin X_j$ we have $E_{k,i,j}(x, y) = 0$.
Furthermore, $E_{k,j,i} = (E_{k,i,j})^t$.

Finally, to find the desired algebra isomorphism (10) we determine the values of
$p_r(k, i, j)$ by formula (15). We represent the orbits $R_1, \ldots, R_N$ by triples $(r, s, d)$:
Two pairs $(x, y), (x', y') \in X \times X$ are equivalent whenever $x, x' \in X_r$, $y, y' \in X_s$,
and $v(x, y) = v(x', y') = d$. Then,

$p_{r,s,d}(k, i, j) = \frac{v_{r,s,d}\, E_{k,i,j}(x, y)}{h_k}$,

where

$v_{r,s,d} = \binom{n}{d} \binom{n-d}{r-d} \binom{n-r}{s-r+d}$.
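
The orbit sizes can be cross-checked by brute force for small $n$. The sketch below assumes the closed form $v_{r,s,d} = \binom{n}{d}\binom{n-d}{r-d}\binom{n-r}{s-r+d}$; it enumerates all pairs $(x, y) \in X \times X$, groups them by the triple $(r, s, d)$, and compares the counts with that formula. The function names are illustrative.

```python
from collections import Counter
from itertools import product
from math import comb


def v_formula(n, r, s, d):
    """Closed form for the number of pairs (x, y) with weights r, s and v(x, y) = d."""
    if d < 0 or r - d < 0 or s - r + d < 0:
        return 0
    return comb(n, d) * comb(n - d, r - d) * comb(n - r, s - r + d)


def orbit_sizes(n):
    """Count pairs (x, y) in X x X by the triple (weight of x, weight of y, v(x, y))."""
    counts = Counter()
    for x in product((0, 1), repeat=n):
        for y in product((0, 1), repeat=n):
            d = sum(1 for xl, yl in zip(x, y) if xl == 1 and yl == 0)
            counts[(sum(x), sum(y), d)] += 1
    return counts


counts = orbit_sizes(4)
print(len(counts), all(v_formula(4, *t) == c for t, c in counts.items()))
```

The number of triples that occur equals $\binom{n+3}{3}$, the dimension of the Terwilliger algebra.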

Remark 4.2. In a similar way one can give an interpretation of the block
diagonalization of the Terwilliger algebra for nonbinary codes which was computed in
[11]. Using [8, Theorem 4.2] one can show the matrix entries are, up to scaling
factors, products of Hahn polynomials and Krawtchouk polynomials.